147 research outputs found

    Energy-Based Clustering: Fast and Robust Clustering of Data with Known Likelihood Functions

    Clustering has become an indispensable tool in the presence of increasingly large and complex data sets. Most clustering algorithms depend, either explicitly or implicitly, on the sampled density. However, estimated densities are fragile due to the curse of dimensionality and finite sampling effects, for instance in molecular dynamics simulations. To avoid the dependence on estimated densities, an energy-based clustering (EBC) algorithm based on the Metropolis acceptance criterion is developed in this work. In the proposed formulation, EBC can be considered a generalization of spectral clustering in the limit of large temperatures. Taking the potential energy of a sample explicitly into account alleviates requirements regarding the distribution of the data. In addition, it permits the subsampling of densely sampled regions, which can result in significant speed-ups and sublinear scaling. The algorithm is validated on a range of test systems including molecular dynamics trajectories of alanine dipeptide and the Trp-cage miniprotein. Our results show that including information about the potential-energy surface can largely decouple clustering from the sampling density.
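
    For readers unfamiliar with the Metropolis acceptance criterion named in this abstract, a minimal sketch of the generic criterion is given below. It is not the EBC algorithm itself; the function name and the parameter `kT` (thermal energy in the same units as the energies) are assumptions made for illustration.

```python
import math


def metropolis_acceptance(e_from: float, e_to: float, kT: float) -> float:
    """Generic Metropolis probability of accepting a move e_from -> e_to.

    Illustrative only: the name and signature are assumptions, not
    taken from the EBC paper.
    """
    if kT <= 0:
        raise ValueError("kT must be positive")
    delta = e_to - e_from
    # Downhill (or flat) moves are always accepted.
    if delta <= 0.0:
        return 1.0
    # Uphill moves are accepted with Boltzmann-weighted probability.
    return math.exp(-delta / kT)
```

    At very large `kT` the acceptance probability tends to 1 regardless of the energy difference, which appears to be the high-temperature regime the abstract refers to.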

    Hybrid Classical/Machine-Learning Force Fields for the Accurate Description of Molecular Condensed-Phase Systems

    Electronic structure methods offer, in principle, accurate predictions of molecular properties; however, their applicability is limited by computational costs. Empirical methods are cheaper, but come with inherent approximations and are dependent on the quality and quantity of training data. The rise of machine learning (ML) force fields (FFs) exacerbates limitations related to training data even further, especially for condensed-phase systems for which the generation of large and high-quality training datasets is difficult. Here, we propose a hybrid ML/classical FF model that is parametrized exclusively on high-quality ab initio data of dimers and monomers in vacuum but is transferable to condensed-phase systems. The proposed hybrid model combines our previous ML-parametrized classical model with ML corrections for situations where classical approximations break down, thus combining the robustness and efficiency of classical FFs with the flexibility of ML. Extensive validation on benchmarking datasets and experimental condensed-phase data, including organic liquids and small-molecule crystal structures, showcases how the proposed approach may promote FF development and unlock the full potential of classical FFs.

    Graph Convolutional Neural Networks for (QM)ML/MM Molecular Dynamics Simulations

    To accurately study chemical reactions in the condensed phase or within enzymes, both a quantum-mechanical description and sufficient configurational sampling are required to reach converged estimates. Here, quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulations play an important role, providing QM accuracy for the region of interest at a decreased computational cost. However, QM/MM simulations are still too expensive to study large systems on longer time scales. Recently, machine learning (ML) models have been proposed to replace the QM description. The main limitation of these models lies in the accurate description of long-range interactions present in condensed-phase systems. To overcome this issue, a recent workflow has been introduced combining a semi-empirical method (i.e. density functional tight binding (DFTB)) and a high-dimensional neural network potential (HDNNP) in a Δ-learning scheme. This approach has been shown to be capable of correctly incorporating long-range interactions within a cutoff of 1.4 nm. One of the promising alternative approaches to efficiently take long-range effects into account is the development of graph convolutional neural networks (GCNN) for the prediction of the potential-energy surface. In this work, we investigate the use of GCNN models, with and without a Δ-learning scheme, for (QM)ML/MM MD simulations. We show that the Δ-learning approach using a GCNN and DFTB as baseline achieves competitive performance on our benchmarking set of solutes and chemical reactions in water. The method is additionally validated by performing prospective (QM)ML/MM MD simulations of retinoic acid in water and S-adenosylmethionate interacting with cytosine in water. The results indicate that the Δ-learning GCNN model is a valuable alternative for (QM)ML/MM MD simulations of condensed-phase systems.

    Machine Learning in QM/MM Molecular Dynamics Simulations of Condensed-Phase Systems

    Quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulations have been developed to simulate molecular systems where an explicit description of changes in the electronic structure is necessary. However, QM/MM MD simulations are computationally expensive compared to fully classical simulations as all valence electrons are treated explicitly and a self-consistent field (SCF) procedure is required. Recently, approaches have been proposed to replace the QM description with machine-learned (ML) models. However, condensed-phase systems pose a challenge for these approaches due to long-range interactions. Here, we establish a workflow that incorporates the MM environment as an element type in a high-dimensional neural network potential (HDNNP). The fitted HDNNP describes the potential-energy surface of the QM particles with an electrostatic embedding scheme. Thus, the MM particles feel a force from the polarized QM particles. To achieve chemical accuracy, we find that even simple systems require models with a strong gradient regularization, a large number of data points, and a substantial number of parameters. To address this issue, we extend our approach to a delta-learning scheme, where the ML model learns the difference between a reference method (DFT) and a cheaper semi-empirical method (DFTB). We show that such a scheme reaches the accuracy of the DFT reference method, while requiring significantly fewer parameters. Furthermore, the delta-learning scheme is capable of correctly incorporating long-range interactions within a cutoff of 1.4 nm. It is validated by performing MD simulations of retinoic acid in water and the interaction between S-adenosylmethionate and cytosine in water. The presented results indicate that delta-learning is a promising approach for (QM)ML/MM MD simulations of condensed-phase systems.
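
    The delta-learning idea described in this abstract (learn only the difference between a reference method and a cheaper baseline, then add the learned correction back to the baseline) can be sketched on a toy problem. Everything below is an assumption made for illustration: the two analytic functions merely stand in for DFT and DFTB, and a least-squares line stands in for the HDNNP; none of it is the authors' implementation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ a*x + b (stand-in for the ML model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def cheap_energy(x):
    """Hypothetical cheap baseline (plays the role of DFTB)."""
    return x ** 2


def reference_energy(x):
    """Hypothetical expensive reference (plays the role of DFT)."""
    return x ** 2 + 0.3 * x + 0.1


# Train the correction on the *difference* between reference and baseline.
train_x = [-1.0, -0.5, 0.0, 0.5, 1.0]
deltas = [reference_energy(x) - cheap_energy(x) for x in train_x]
a, b = fit_line(train_x, deltas)


def delta_learned_energy(x):
    """Baseline energy plus the learned correction."""
    return cheap_energy(x) + a * x + b
```

    In this toy case the difference is much simpler than the full energy, so a two-parameter model recovers the reference, which mirrors the abstract's observation that the delta-learning scheme needs fewer parameters.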

    Solvating atomic level fine-grained proteins in supra-molecular level coarse-grained water for molecular dynamics simulations

    Simulation of the dynamics of a protein in aqueous solution using an atomic model for both the protein and the many water molecules is still computationally extremely demanding considering the time scale of protein motions. The use of supra-atomic or supra-molecular coarse-grained (CG) models may enhance the computational efficiency, but inevitably at the cost of reduced accuracy. Coarse-graining solvent degrees of freedom is likely to yield a favourable balance between reduced accuracy and enhanced computational speed. Here, the use of a supra-molecular coarse-grained water model that largely preserves the thermodynamic and dielectric properties of atomic level fine-grained (FG) water in molecular dynamics simulations of an atomic model for four proteins is investigated. The results of using an FG, a CG, an implicit, or a vacuum solvent environment of the four proteins are compared, and for hen egg-white lysozyme a comparison to NMR data is made. The mixed-grained simulations do not show large differences compared to the FG atomic level simulations, apart from an increased tendency to form hydrogen bonds between long side chains, which is due to the reduced ability of the supra-molecular CG beads that represent five FG water molecules to make solvent-protein hydrogen bonds. However, the mixed-grained simulations are at least an order of magnitude faster than the atomic level ones.

    Free energy calculations offer insights into the influence of receptor flexibility on ligand-receptor binding affinities

    Docking algorithms for computer-aided drug discovery and design often ignore or restrain the flexibility of the receptor, which may lead to a loss of accuracy of the relative free enthalpies of binding. In order to evaluate the contribution of receptor flexibility to relative binding free enthalpies, two host-guest systems have been examined: inclusion complexes of α-cyclodextrin (αCD) with 1-chlorobenzene (ClBn), 1-bromobenzene (BrBn) and toluene (MeBn), and complexes of DNA with the minor-groove binding ligands netropsin (Net) and distamycin (Dist). Molecular dynamics simulations and free energy calculations reveal that restraining of the flexibility of the receptor can have a significant influence on the estimated relative ligand-receptor binding affinities as well as on the predicted structures of the biomolecular complexes. The influence is particularly pronounced in the case of flexible receptors such as DNA, where a 50% contribution of DNA flexibility towards the relative ligand-DNA binding affinities is observed. The differences in the free enthalpy of binding do not arise only from the changes in ligand-DNA interactions but also from changes in ligand-solvent interactions as well as from the loss of DNA configurational entropy upon restraining.

    Free enthalpies of replacing water molecules in protein binding pockets

    Water molecules in the binding pocket of a protein and their role in ligand binding have increasingly raised interest in recent years. Displacement of such water molecules by ligand atoms can be either favourable or unfavourable for ligand binding depending on the change in free enthalpy. In this study, we investigate the displacement of water molecules by an apolar probe in the binding pocket of two proteins, cyclin-dependent kinase 2 and tRNA-guanine transglycosylase, using the method of enveloping distribution sampling (EDS) to obtain free enthalpy differences. In both cases, a ligand core is placed inside the respective pocket and the remaining water molecules are converted to apolar probes, both individually and in pairs. The free enthalpy difference between a water molecule and a CH3 group at the same location in the pocket in comparison to their presence in bulk solution calculated from EDS molecular dynamics simulations corresponds to the binding free enthalpy of CH3 at this location. From the free enthalpy difference and the enthalpy difference, the entropic contribution of the displacement can be obtained too. The overlay of the resulting occupancy volumes of the water molecules with crystal structures of analogous ligands shows qualitative correlation between experimentally measured inhibition constants and the calculated free enthalpy differences. Thus, such an EDS analysis of the water molecules in the binding pocket may give valuable insight for potency optimization in drug design.
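
    The step from free enthalpy and enthalpy differences to the entropic contribution follows from the standard thermodynamic relation at constant temperature T. The symbols ΔG, ΔH and ΔS are notation assumed here for the free enthalpy, enthalpy and entropy differences of the displacement; they are not taken from the abstract.

```latex
\Delta G = \Delta H - T\,\Delta S
\quad\Longrightarrow\quad
-T\,\Delta S = \Delta G - \Delta H
```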

    Definition and testing of the GROMOS force-field versions 54A7 and 54B7

    New parameter sets of the GROMOS biomolecular force field, 54A7 and 54B7, are introduced. These parameter sets summarise some previously published force field modifications: The 53A6 helical propensities are corrected through new φ/ψ torsional angle terms and a modification of the N-H, C=O repulsion, a new atom type for a charged −CH3 in the choline moiety is added, the Na+ and Cl− ions are modified to reproduce the free energy of hydration, and additional improper torsional angle types for free energy calculations involving a chirality change are introduced. The new helical propensity modification is tested using the benchmark proteins hen egg-white lysozyme, fox1 RNA binding domain, chorismate mutase and the GCN4-p1 peptide. The stability of the proteins is improved in comparison with the 53A6 force field, and good agreement with a range of primary experimental data is obtained.

    RE-EDS Using GAFF Topologies: Application to Relative Hydration Free-Energy Calculations for Large Sets of Molecules

    Free-energy differences between pairs of end-states can be estimated based on molecular dynamics (MD) simulations using standard pathway-dependent methods such as thermodynamic integration (TI), free-energy perturbation, or Bennett's acceptance ratio. Replica-exchange enveloping distribution sampling (RE-EDS), on the other hand, allows for the sampling of multiple end-states in a single simulation without the specification of any pathways. In this work, we use the RE-EDS method as implemented in GROMOS together with generalized AMBER force field (GAFF) topologies, converted to a GROMOS-compatible format with a newly developed GROMOS++ program amber2gromos, to compute relative hydration free energies for a series of benzene derivatives. The results obtained with RE-EDS are compared to the experimental data as well as calculated values from the literature. In addition, the estimated free-energy differences in water and in vacuum are compared to values from TI calculations carried out with GROMACS. The hydration free energies obtained using RE-EDS for multiple molecules are found to be in good agreement with both the experimental data and the results calculated using other free-energy methods. While all considered free-energy methods delivered accurate results, the RE-EDS calculations required the least amount of total simulation time. This work serves as a validation for the use of GAFF topologies with the GROMOS simulation package and the RE-EDS approach. Furthermore, the performance of RE-EDS for a large set of 28 end-states is assessed with promising results.
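
    To illustrate how enveloping distribution sampling can cover several end-states in one simulation, the sketch below evaluates the commonly quoted EDS reference-state energy, V_R = -(1/(βs)) ln Σ_i exp(-βs (V_i - E_i)), for one configuration. The formula is not stated in the abstract and is assumed here from the general EDS literature; the function name, argument names and the log-sum-exp stabilisation are illustrative choices, not the GROMOS implementation.

```python
import math


def eds_reference_energy(energies, offsets, beta, s=1.0):
    """Reference-state energy enveloping all end-states (illustrative sketch).

    energies: end-state potential energies V_i for one configuration
    offsets:  energy offsets E_i, one per end-state
    beta:     1 / (k_B * T), in inverse energy units
    s:        smoothness parameter (s > 0)
    """
    exponents = [-beta * s * (v - e) for v, e in zip(energies, offsets)]
    # Log-sum-exp trick: subtract the maximum to avoid overflow.
    largest = max(exponents)
    log_sum = largest + math.log(sum(math.exp(x - largest) for x in exponents))
    return -log_sum / (beta * s)
```

    With a single end-state and zero offset the function returns that state's energy unchanged; with several end-states the result lies at or below the lowest offset-corrected energy, so low-energy regions of every end-state remain accessible in the reference state.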